Skip to content

perf(continuations): Verify-and-drop continuation epoch proofs instead of collecting them#766

Open
nicole-graus wants to merge 4 commits into
mainfrom
continuation-verify-and-drop
Open

perf(continuations): Verify-and-drop continuation epoch proofs instead of collecting them#766
nicole-graus wants to merge 4 commits into
mainfrom
continuation-verify-and-drop

Conversation

@nicole-graus

Copy link
Copy Markdown
Collaborator

Motivation

prove_and_verify_continuation proved every epoch and kept all proofs in a Vec before verifying, so retained proof memory grew with the number of epochs.

Description

Verify each epoch inline right after proving it and drop its MultiProof, keeping only the small boundary / l2g_root / public_output needed for the final global proof. Retained per-epoch proof memory is now O(1) instead of O(#epochs). The epoch loop is extracted into a shared EpochDriver so prove_continuation (the serialized bundle producer) is unchanged, and the streaming verifier derives is_final/label positionally to stay a faithful mirror of verify_continuation.

@nicole-graus

Copy link
Copy Markdown
Collaborator Author

/ai-review

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown

Codex Code Review

Found one issue:

Medium - prove_and_verify_continuation still retains two full epoch proofs.
In prover/src/continuation.rs, the streaming loop keeps epoch alive, then calls driver.next()? before verifying and destructuring/dropping epoch. That means current and next EpochProof/MultiProof are resident at the same time, and proving the next epoch also runs while the current proof is retained. For large epoch sizes, this can roughly double retained proof memory and undermine the PR’s stated “single epoch” memory bound.

Actionable fix: have EpochDriver::next return the epoch’s trusted is_final metadata, or expose halted state after proving the current epoch, so the loop can verify/drop the current proof before proving the next one. The standalone verifier can still derive is_final positionally for untrusted bundles.

I did not run builds or tests, per the review constraints.

@claude

claude Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Reviewed the streaming verify-and-drop refactor. This is a careful, well-documented change and I found no significant correctness, safety, or performance issues.

Verified invariants:

  • is_final derivation is consistent between the two paths. The driver bakes is_final = executor.pc() == 0 into the AIRs (HALT only on the final epoch), while the streaming verifier re-derives it positionally via one-epoch lookahead (next.is_none()). These are identical by construction — next() returns None iff pc == 0 after the prior epoch — so no honest run is falsely rejected, and any disagreement (e.g. an unexpected resume_with_limitNone mid-execution) is caught by verify_epoch rather than silently accepted. Good defensive design.

  • Label/index sync holds under lookahead. The streaming index (for epoch_label) increments once per verified epoch and matches the driver.index used at production time, even though driver.next() runs one epoch ahead.

  • prove_continuation is behaviorally unchanged. The extracted EpochDriver preserves the MAX_EPOCHS guard, the one-time init_page_data/provenance snapshot, the power-of-two padding invariant, and the resume → None termination semantics.

  • Memory claim is legitimate. Per-epoch MultiProofs are dropped after inline verification; only boundary/l2g_root/public_output (all inherently needed by prove_global / verify_l2g_commitment_binding) are retained — same as verify_continuation already holds.

  • Parity with verify_continuation. The added MAX_PRIVATE_INPUT_SIZE guard and the all_boundaries.is_empty() (zero-epoch) early-return correctly mirror the standalone verifier, and the check ordering (verify_epoch before trusting l2g_root) is preserved. The reg_fini.len() malformed-bundle check is correctly omitted since epochs are produced in-process.

Test coverage is strong — the anti-drift test (test_streaming_matches_two_phase) with an independently-pinned known-good output, the zero-epoch case, and the exactly-2-epoch lookahead boundary all guard the failure modes this refactor introduces.

Minor (informational, no action needed): because of the one-epoch lookahead, when a verify_epoch fails the driver has already proved the following epoch (wasted work). This only triggers on a failing proof, which cannot happen for an honest in-process self-verify, so the cost is effectively never paid.

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown

AI Review

PR #766 · 2 changed files

Findings

Status Sev Location Finding Found by
candidate low prover/src/continuation.rs:36 Docstring/doc says "single epoch" but lookahead holds two epoch proofs minimax
minimax/MiniMax-M3
candidate low prover/src/continuation.rs:1233 test_streaming_exactly_two_epochs can panic on zero/one-cycle programs minimax
minimax/MiniMax-M3

Status column reflects the verdict from the verifier: deepseek-verifier (openrouter/deepseek/deepseek-v4-pro).

AI-001: Docstring/doc says "single epoch" but lookahead holds two epoch proofs
  • Status: candidate
  • Severity: low
  • Location: prover/src/continuation.rs:36
  • Found by: minimax:minimax/MiniMax-M3
  • Verified by: -
  • Rejected by: -

Claim

The doc (module-level and design doc §8) advertises that the streaming path bounds retained-proof memory to "a single epoch at a time", but the one-epoch lookahead loop in prove_and_verify_continuation actually holds two EpochProofs simultaneously: the current pending and the lookahead next.

Evidence

Lines 975–977: let mut pending = driver.next()?; while let Some(epoch) = pending { let next = driver.next()?; ... }. After the current epoch is destructured/dropped at line 1000–1006, only next remains; before that destructuring, both are live. Same claim at docs/continuations_design.md:465 ("a single epoch at a time instead of holding all N"). The bound is O(1) epochs, not strictly 1 — minor imprecision, not a bug.

Suggested fix

Soften the wording to "bounded to O(1) epochs" or "one epoch in steady state, briefly two across the is_final lookahead" to match the actual retention.

AI-002: test_streaming_exactly_two_epochs can panic on zero/one-cycle programs
  • Status: candidate
  • Severity: low
  • Location: prover/src/continuation.rs:1233
  • Found by: minimax:minimax/MiniMax-M3
  • Verified by: -
  • Rejected by: -

Claim

(total as u64 - 1).ilog2() panics on Rust when the argument is 0; the subsequent assert!(epoch_size_log2 >= 2, ...) runs too late to guard against this. For any program with total ≤ 1 cycles, the test panics rather than asserting.

Evidence

Line 1233 calls ilog2() on total - 1 before line 1235's bound check. u64::ilog2(0) panics in Rust, and total - 1 == 0 for total ∈ {0, 1}. For the specific all_loadstore_32 fixture (~34 cycles) this is never reached, but the test is structured to derive epoch_size_log2 from any program — a short fixture triggers a panic instead of a clean assertion failure.

Suggested fix

Use total.checked_sub(1).and_then(|n| (n >= 1).then(|| n.ilog2())) (or guard with assert!(total >= 2)) so the check is performed before the ilog2 call.

Reviewer Lanes

Lane Model Prompt Status Findings
glm openrouter/z-ai/glm-5.2 general error: opencode failed (provider/auth/runtime error) and no findings were submitted 0
kimi openrouter/moonshotai/kimi-k2.7-code general error: opencode failed (provider/auth/runtime error) and no findings were submitted 0
minimax minimax/MiniMax-M3 general success 2
moonmath zro/minimax-m3 general error: agentic lane timed out after 1800s 0
nemotron openrouter/nvidia/nemotron-3-ultra-550b-a55b general error: opencode failed (provider/auth/runtime error) and no findings were submitted 0

Verification Lanes

Lane Model Status Confirmed Rejected Uncertain
deepseek-verifier openrouter/deepseek/deepseek-v4-pro error: opencode failed (provider/auth/runtime error) and no verifications were submitted 0 0 0

Native Codex and Claude reviews run separately and post their own comments. They are not included in this structured provenance report.

Raw lane outputs, candidates, final issues, and model metrics are uploaded as workflow artifacts.

…778)

- Clarify the private-input guard comment: the parity it buys is with
  verify_continuation (the standalone verifier); the two-phase composition
  prove_continuation + verify_continuation instead fails inside Executor::new
  with Error::Execution before that bound is reached. Add the missing
  regression test pinning the guard's InvalidTableCounts variant.
- Fail loudly (ContinuationInvariant) if resume_with_limit ever returns None
  mid-run instead of ending the driver cleanly: a clean end would let
  prove_continuation emit a bundle whose last epoch lacks HALT — never
  verifiable, with no diagnostic at the cause. Unreachable today (None only
  occurs with pc == 0 at entry, already guarded), so no behavior change.
- Add reciprocal sync notes between verify_continuation's epoch loop and its
  inline mirror in prove_and_verify_continuation: the differential test only
  exercises honest proofs, so it cannot catch a rejection check dropped from
  one of the two copies.
- Document the deliberate reg_fini length-guard asymmetry (in-process fini is
  trusted; a wrong length is an invariant bug that should fail loudly, not be
  reported as a rejected proof).
- Qualify the lookahead comment (positional is_final buys mirror fidelity,
  not independence from the in-process executor) and the design doc's 'small
  boundary' claim (boundaries grow with total touched memory; the O(1) bound
  is on retained proof data).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants